Investigating Prosodic Modifications for Polyglot Text-to-Speech Synthesis
Abstract
This paper investigates the need for applying English prosody when synthesising English portions of mixed English/German texts using a German-based polyglot text-to-speech (TTS) synthesis system. The polyglot system is based on a monolingual German TTS system, which uses a phone mapping from English to German to synthesise English texts. Two systems with varying degrees of assimilation to English are compared: one in which prosody is derived from the German monolingual system, and one in which it is derived from an equivalent English monolingual system. The naturalness of the different prosody approaches and the overall intelligibility and acceptability of the polyglot systems are assessed by native bilingual speakers of English and German, on German texts containing English inclusions of varying length and on complete English texts. The results show that both German and English subjects preferred English prosody for longer English inclusions or complete English texts, but had no preference for short inclusions.
Similar resources
From multilingual to polyglot speech synthesis
This paper proposes a distinction between existing multilingual synthesis systems and mixed-lingual or polyglot synthesis systems. The latter should be capable of synthesising with the same voice utterances which contain foreign language words or word groups. As a first step towards polyglot synthetic speech, the design and realisation of a 4-lingual single-speaker diphone inventory is detailed...
Polyglot speech prosody control
Within a polyglot text-to-speech synthesis system, the generation of an adequate prosody for mixed-lingual texts, sentences, or even words, requires a polyglot prosody model that is able to seamlessly switch between languages and that applies the same voice for all languages. This paper presents the first polyglot prosody model that fulfills these requirements and that is constructed from indep...
A Mixed-Lingual Phonological Component Which Drives the Statistical Prosody Control of a Polyglot TTS Synthesis System
A polyglot text-to-speech synthesis system which is able to read aloud mixed-lingual text has first of all to derive the correct pronunciation. This is achieved with an accurate morpho-syntactic analyzer that works simultaneously as language detector, followed by a phonological component which performs various phonological transformations. The result of these symbol processing steps is a comple...
The ESPRIT Project POLYGLOT
The ESPRIT project POLYGLOT aims to develop multilingual Speech-to-Text and Text-to-Speech conversion and to integrate this technology into a number of commercially viable prototype applications. Speech-to-Text conversion is mainly concerned with very large vocabulary isolated word recognition. It uses a statistical knowledge based approach that was pioneered for Italian and is now being exte...
Native Language Identification Based on English Accent
The present work investigates the influence of the mother tongue (L1) of a South Indian speaker on a second language (L2). The second language can be a dominant local language, the national language of India (i.e., Hindi), or the connecting language, English. In the current study, L2 is a short discourse in English. Cepstral and prosodic features were used as in Language Identification (LID) to distingu...